17. 5. 2022

What to expect

  • Why visualizations rather than text
  • General principles of visualization
  • Grammar of graphics

Why visualizations rather than text

Florence Nightingale and the Crimean War (1850s)

Source Highcharts.com

John Snow and the cholera epidemic in London

Source Highcharts.com

Count all the threes

Source Ware (2012)

Count all the threes now

Source Ware (2012)

We remember better visually

Source Medina (2014)

General principles of visualization

Edward Tufte

A key figure in the modern approach to visualization of information.

Chartjunk; data-ink ratio; data density; micro/macro readings.

General principles of visualization

  1. Emphasis on data
  2. Readability
  3. Integrity

PRINCIPLE 1: Emphasis on data

Less is more. Graphs are meant to communicate information effectively; design is meant to support that goal, not obscure it.

Source Harford (2021)

DATA-INK ratio

“Above all else show the data.” (Edward Tufte)

Source

More is less

Sometimes a little extra ink is worth it…

Do not use 3D charts

A 3D effect is not just unnecessary; it actively harms the reader's ability to compare values.

Do not rely on defaults

Excel pie chart

Source of funding [thousands CZK]

Excel pie chart - emphasis

Funding by source, in thousands of CZK

Excel bar chart

BEFORE

AFTER

Excel bar chart - time series

BEFORE

AFTER

Excel Line Chart

BEFORE

AFTER

Excel Likert scale (diverging chart)

BEFORE

AFTER

Excel Likert scale - alternative version (diverging chart)

BEFORE

AFTER

PRINCIPLE 2: Readability

Pie charts are not suitable for making comparisons

% university-educated in new EU members

Example of improved readability and emphasis

“Small multiples” improve readability of time series

If you have a flexible tool, you can be creative…

Careful with this one…

May be useful for two categories

Well-managed data density

PRINCIPLE 3: Integrity

You decide which message the visualization brings to the forefront, but you are also responsible for any distortion or manipulation it introduces.

How much are the prices of flats rising?

The big problem with the y axis

y-axis at -20 mil. (top), at 0 (bottom)

Sometimes the y-axis is arbitrary

Sometimes, we just need to “zoom”

So if the y-axis does not start at 0 …
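The effect of the y-axis baseline can be tried out directly in ggplot2. The following is a minimal sketch, not the code behind the slides: the `prices` data frame is hypothetical and its values are invented purely for illustration. It contrasts the default axis, which is fitted to the data range, with an axis that is forced to include zero.

```r
library(ggplot2)

# Hypothetical prices (millions CZK), invented purely for illustration
prices <- data.frame(
  year  = 2017:2021,
  price = c(20.0, 20.5, 21.0, 21.8, 22.5)
)

base <- ggplot(prices, aes(x = year, y = price)) +
  geom_line() +
  labs(x = NULL, y = "Price (millions CZK, hypothetical)")

# Default: the y-axis is fitted to the data range, which "zooms" in on the change
zoomed <- base

# Zero baseline: extend the y-axis so that it includes 0
from_zero <- base + expand_limits(y = 0)
```

Printing `zoomed` and `from_zero` side by side shows the same series looking steep in one chart and nearly flat in the other, which is why the choice of baseline should be deliberate and visible to the reader.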

Should we be the least worried about poverty of all European countries?

Visualizing uncertainty

Data from July 2021

Uncertainty can also be visualized in model estimates
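As a minimal sketch of uncertainty around a model estimate, a fitted line can be drawn together with its confidence band. The example assumes the `penguins` data come from the palmerpenguins package; the choice of variables and of a linear model is an illustration, not the model shown on the slide.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# Keep only rows where both plotted measurements are present
peng <- penguins[!is.na(penguins$flipper_length_mm) &
                   !is.na(penguins$body_mass_g), ]

p <- ggplot(peng, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(alpha = 0.4) +
  # Linear fit with a 95% confidence band drawn as a ribbon around the line
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, level = 0.95)
```

The ribbon widens where data are sparse, so the reader sees not only the estimate but also how much it can be trusted.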

A true visualization, BUT…

General principles of visualization - SUMMARY

Emphasis on data

  • Default settings often need to be changed
  • Keep only those chart elements that have an informational value
  • Do not use 3D charts
  • Think about what you want the chart to say
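One way to apply these points in ggplot2 is to start from a light theme and remove elements that carry no information. This is a sketch under assumptions: the `penguins` data are taken from the palmerpenguins package, and the title and the specific elements removed are illustrative choices.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

p <- ggplot(penguins, aes(x = species)) +
  geom_bar(fill = "grey40") +
  # Say what the chart is about; drop the redundant x-axis title
  labs(title = "Penguins observed per species", x = NULL, y = "Count") +
  # Replace the default grey background with a lighter theme
  theme_minimal() +
  # Remove grid lines that do not help to read the bar heights
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor   = element_blank())
```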

Readability

  • Respect human cognition
  • Horizontal chart labels are better than vertical
  • Think about the context in which the reader encounters the chart
  • Be inspired by creative approaches
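Horizontal labels are easiest to get by placing the categories on the y-axis, so the bars run horizontally and the labels read left to right. A minimal sketch, assuming the `penguins` data from the palmerpenguins package and ggplot2 3.3.0 or newer (older versions would need `coord_flip()` instead):

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# Mapping the category to y gives horizontal bars with horizontal labels
p <- ggplot(penguins, aes(y = species)) +
  geom_bar() +
  labs(x = "Count", y = NULL)
```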

Integrity

  • Be careful with the y-axis
  • Communicate the meaning of what you visualize
  • Take into account the degree of uncertainty

Visualization architecture (grammar of graphics)

Leland Wilkinson and ‘The Grammar of Graphics’ (book)

What makes a good visualization? Individual components…

  1. Data
  2. Variables
  3. Algebra
  4. Scale
  5. Geometry (line chart, bar chart, …)
  6. “Aesthetics” (colors, shapes, saturation, …)

Hadley Wickham and a software implementation of Wilkinson’s ideas

ggplot2

Seven chart layers. Three required:

  1. Data

  2. Aesthetics - mapping information to color, shape, saturation, …

  3. Geometry - graphic elements that represent data

Four “extra”:

  1. Facets (small multiples)

  2. Aggregated statistics (e.g. regression curve)

  3. Coordinates (e.g. logarithmic scale)

  4. Theme - chart design
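The seven layers can be seen together in one call. The sketch below assumes the `penguins` data from the palmerpenguins package; the variables and the theme are illustrative choices. Note that the logarithmic axis is set here through a scale function, which is the simplest way to get one in ggplot2.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# Keep only rows where both plotted measurements are present
peng <- penguins[!is.na(penguins$flipper_length_mm) &
                   !is.na(penguins$body_mass_g), ]

p <- ggplot(data = peng,                                        # 1. data
            aes(x = flipper_length_mm, y = body_mass_g)) +      # 2. aesthetics
  geom_point(alpha = 0.5) +                                     # 3. geometry
  facet_wrap(~ species) +                                       # facets (small multiples)
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +     # statistics (fitted line per facet)
  scale_y_log10() +                                             # logarithmic y-axis
  theme_classic()                                               # theme
```

Only the first three lines are needed to draw a chart; each of the remaining lines can be removed independently.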

Data

## # A tibble: 6 x 8
##   species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g sex  
##   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
## 1 Adelie  Torge~           39.1          18.7              181        3750 male 
## 2 Adelie  Torge~           39.5          17.4              186        3800 fema~
## 3 Adelie  Torge~           40.3          18                195        3250 fema~
## 4 Adelie  Torge~           NA            NA                 NA          NA <NA> 
## 5 Adelie  Torge~           36.7          19.3              193        3450 fema~
## 6 Adelie  Torge~           39.3          20.6              190        3650 male 
## # ... with 1 more variable: year <int>

ggplot(data = penguins)
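The snippets in this part do not show their setup. A minimal sketch of what they appear to assume: ggplot2 is loaded and `penguins` comes from the palmerpenguins package (the package name is an assumption; the column names match the printout above).

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# First six rows, corresponding to the printout above
head(penguins)

# Data layer only: the result is an empty canvas with no axes or shapes yet
p <- ggplot(data = penguins)
```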

Aesthetics

  • Axes
  • Outline
  • Fill
  • Size
  • Transparency
  • Shape

ggplot(data = penguins, 
       aes(x = sex))
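Aesthetics can either be mapped to a variable inside `aes()` or set to a constant outside it. The sketch below assumes the `penguins` data from the palmerpenguins package and adds a point geometry only so that the mappings become visible.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# Drop rows with unknown sex so that every point has a defined shape
peng <- penguins[!is.na(penguins$sex), ]

p <- ggplot(peng,
            aes(x = flipper_length_mm,   # axes
                y = body_mass_g,
                colour = species,        # colour mapped to a variable
                shape = sex)) +          # shape mapped to a variable
  geom_point(size = 2, alpha = 0.6)      # size and transparency set as constants
```

Mapped aesthetics get a legend automatically; constants do not, because they carry no information about the data.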

Geometry

  • lines
  • points
  • columns
  • histogram
  • boxplot

# Bar chart: geom_bar() counts the rows in each category of sex
ggplot(data = penguins, 
       aes(x = sex)) + 
  geom_bar()

Geometry 2

  • lines
  • points
  • columns
  • histogram
  • boxplot

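library(dplyr)  # provides %>% and filter()

# Drop rows with unknown sex, then compare bill length between the sexes
ggplot(data = penguins %>% 
         filter(!is.na(sex)), 
       aes(x = sex,
           y = bill_length_mm)) + 
  geom_boxplot() +
  theme_classic()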

Gallery 1

Gallery 2

Gallery 3

Gallery 4

Gallery 5

Acknowledgements

This presentation naturally draws on an enormous volume of work by an enormous number of people.

Nevertheless, I would especially like to thank Petr Bouchal. In 2016 we prepared a course on the methodology of science together for Discover, a summer academy for high school students, in which we devoted a lot of space to visualization. Petr was also a guest lecturer in my courses at the Faculty of Arts, CU, and it was only during his lectures that I fully appreciated the value of treating visualization as a full-fledged auxiliary scientific discipline. I first encountered a number of the examples in this presentation thanks to Petr.

Additional resources - principles and applications

Additional resources - working with ggplot2

Referenced literature and other sources

If a resource referenced in the presentation is not interactive (it does not link directly to its location), you can find it in the list below:

Harford, Tim. 2021. How to Make the World Add up: Ten Rules for Thinking Differently about Numbers. 1st edition. London: The Bridge Street Press.

Medina, John. 2014. Brain Rules (Updated and Expanded): 12 Principles for Surviving and Thriving at Work, Home, and School. Second edition. Seattle, WA: Pear Press.

Ware, Colin. 2012. Information Visualization: Perception for Design. 3rd edition. Waltham, MA: Morgan Kaufmann.